any associated compools, and a symbol table in which information about these
identifiers is recorded are used to assist in the statistics collection process.

This statistics collector is expected to be a valuable tool in the development
of JOVIAL J3 and other programming languages by providing guidance relative to
(1) more effective methods of programming, (2) implementation of compilers with
greater efficiency, and (3) possible language changes.

This report also includes a summary of the syntax, semantics, and computer
system interface errors made by the implementor in the process of developing
the software package. The visibility provided by this information is
expected to increase understanding of the nature, causes, and methods of
avoidance of software errors.

A STATISTICS COLLECTION PACKAGE FOR THE JOVIAL J3 PROGRAMMING LANGUAGE


Robert E. Stover, Jr.


1.0  INTRODUCTION 

The JOVIAL J3 Statistics Collector, produced under an in-house effort by
the Rome Air Development Center, is a software package designed to
measure usage of the various constructs and features of the JOVIAL J3
higher-order computer programming language. The statistics collector
takes as input a JOVIAL J3 source program, and operates upon this code
to obtain various statistical quantities about the program. It is hoped
that the statistical information derived by this package will enable
more effective programming in the JOVIAL J3 language, greater efficiency
in implementation of JOVIAL J3 compilers, and suggestions regarding
possible language changes. Since JOVIAL J3 has been the official Air
Force language for command and control applications, it is an ideal
language for such analysis.

This statistics collector is currently hosted and running on the HIS
600/6000 GCOS computer system at RADC, but has been designed to be as
system independent as possible. It processes JOVIAL J3 as described by
MIL-STD-1588 (USAF), 30 June 1976, with certain features added for
compatibility with the JOVIAL Compiler Implementation Tool (JOCIT).

A summary of basic facts about the JOVIAL J3 Statistics Collector
appears in Table 1.


2.0  HISTORY 

A brief summary of key events in the development of the JOVIAL J3
Statistics Collector is presented here. Since the productivity of the
programmer and implementor, expressed in lines of code generated as a
function of man-hours of work, is of interest, this information is
presented in Graph 1.

2.1  Early  Work 

Construction of the JOVIAL J3 Statistics Collector began in July 1974,
after a need for statistical information about the language became
evident within the Air Force. The first task undertaken involved the
execution of a hash coding(*) routine with a given list of names as
input; no processing of an actual program was performed.


In November 1974, a design for the statistics collector consisting of a
compool, main program, routine to return tokens of the language, and
output routine was put onto the system. At first the token returning
routine was merely a prearranged list of tokens, but soon an actual
operational scanner was implemented. These four modules correspond to
the present STCOMP, STATCO, GETTOK, and PUTOUT.

2.2  Process  of  Major  Development 

Once its basic structure had been established, the statistics collector
was progressively extended. At first the package only yielded
information on numbers of types of statements and perhaps one or two
other miscellaneous items, but in December 1974, a subroutine to process
assignment statements and count various operators and operands contained
therein was implemented. In January 1975, hash coding of identifiers was
restored as part of the package, using the independently compiled
subroutine TOKHSH.

Work on the statistics collector was then suspended for six months. One
of the first tasks conducted upon resumption of the effort was the
detection and counting of various types of declarations. Also, the
recompilation of the entire statistics collector every time a change was
made to any part of it was found to be wasteful and expensive, so a
procedure was established for saving the object code of modules on
permfiles on the GCOS computer system and only recompiling modules
affected by changes prior to a run. Since changes to a compool generally
require recompilation of any referencing programs, it was decided to
move the statistical data, only referenced by STATCO and PUTOUT, into a
new compool COMPST, requiring recompilation of only this compool and
these two executable modules whenever the statistical data base was
revised.
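
The recompilation rule described above can be illustrated with a small
sketch. It is written in Python purely for illustration and is not the
actual GCOS procedure. The dependency map follows the compool references
stated in Sections 3.1 and 3.2 for the present eight-module structure;
the name modules_to_recompile is hypothetical.

```python
# Hypothetical sketch of selective recompilation; not the actual GCOS procedure.
# Compool references follow Sections 3.1 and 3.2 of this report:
# STCOMP is referenced by all executable modules, COMPST only by STATCO and PUTOUT.
REFERENCES = {
    "STATCO": {"STCOMP", "COMPST"},
    "PASS1": {"STCOMP"},
    "GETTOK": {"STCOMP"},
    "TOKHSH": {"STCOMP"},
    "PUTOUT": {"STCOMP", "COMPST"},
    "TABOUT": {"STCOMP"},
}


def modules_to_recompile(changed):
    """Return the sorted modules needing recompilation after `changed` are edited."""
    changed = set(changed)
    result = set(changed)
    for module, compools in REFERENCES.items():
        # An executable module must be recompiled if any compool it references changed.
        if compools & changed:
            result.add(module)
    return sorted(result)


if __name__ == "__main__":
    print(modules_to_recompile({"COMPST"}))
    print(modules_to_recompile({"STCOMP"}))
```

Under these assumptions, a change to COMPST touches only COMPST, STATCO,
and PUTOUT, whereas a change to STCOMP forces recompilation of every
executable module.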

In October 1975, a symbol table containing information about identifiers
was incorporated into the statistics collector, and this capability has


(*) Hash coding is a method by which a name is assigned to a unique
location in a table having n entries by determining the value modulo n
of some integer function of the name, such as the sum of the machine
representations of the words containing it. If the entry determined for
a given name is already occupied by another name (this is called a
collision), a quadratic function is applied with successive integers,
and this amount added to the original location, to determine a new
table entry until a vacant position for the name can be found in the
table. When this name is again referenced by the program, this
procedure can be repeated to find where it was originally stored,
precluding the necessity of searching through most of the table for the
name.

been progressively upgraded to the present. The hash coding output
routine, previously called HSHOUT, was renamed TABOUT, and now prints
both the hash coding and symbol tables.

The inclusion of a symbol table prompted the revision of the statistics
collector to a two-pass structure. This was done for a while by making
GETTOK the entire first pass, fetching all of the tokens upon one call
from STATCO and creating the symbol table, while STATCO performed the
second pass statistical processing. This was found to be inefficient, so
a new subroutine module, PASS1, was introduced. PASS1 was called once
from STATCO, and in turn called GETTOK, which had been reverted to
yielding one token per call, whenever the next token in the input stream
was needed. At this point, all eight independently compiled modules of
the present statistics collector structure were in place (see
Section 3.0).
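
The two-pass arrangement can be sketched as follows. This is a minimal
illustration in Python, not the JOVIAL J3 code of the package. The
names make_scanner, pass1, and pass2 are hypothetical stand-ins for
GETTOK, PASS1, and the second-pass logic of STATCO, and the sketch
assumes that the second pass reads the token stream again, which the
report does not state.

```python
from collections import Counter


def make_scanner(tokens):
    """Return a function that yields one token per call, then None (stand-in for GETTOK)."""
    it = iter(tokens)
    return lambda: next(it, None)


def pass1(get_token, declarators=("ITEM", "TABLE", "ARRAY")):
    """First pass: record each declared name and its class in a symbol table."""
    symbols = {}
    tok = get_token()
    while tok is not None:
        if tok in declarators:
            name = get_token()  # one-token lookahead for the declared name
            if name is not None:
                symbols[name] = tok
        tok = get_token()
    return symbols


def pass2(get_token, symbols):
    """Second pass: count occurrences of names recorded during the first pass."""
    counts = Counter()
    tok = get_token()
    while tok is not None:
        if tok in symbols:
            counts[tok] += 1
        tok = get_token()
    return counts


if __name__ == "__main__":
    source = ["ITEM", "AA", "I", "$", "AA", "=", "AA", "+", "1", "$"]
    table = pass1(make_scanner(source))
    print(table, dict(pass2(make_scanner(source), table)))
```

Because the scanner hands back a single token per call, the first pass
decides for itself when lookahead is needed, rather than receiving the
whole token list at once.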

From this point until major testing of the package was begun in June
1976, most of the work consisted of refinement of the statistics
collector and increasing the types of statistical quantities obtained.
Two major extensions which deserve mention are the implementation of
compool resolution of identifiers and the processing of define
directives and substitutions.

2.3  Recent  Testing  and  Debugging 

During most of the process of its development, the statistics collector
was run with its own modules as input. However, during June, July, and
December 1976, the JOVIAL J3 Compiler Validation System (JCVS-J3) and
certain software produced by Strategic Air Command (SAC) headquarters
were used as input to the package. Previously undetected errors
uncovered by this process included failure to handle unnamed tables;
wrong detection of control transfer and switch names in some instances;
and incorrect symbol table type assignments from item descriptions. The
utility of the SAC software was limited by lack of access to the
compools referenced by those programs.


3.0  DESCRIPTION  OF  MODULES 

The JOVIAL J3 Statistics Collector currently consists of eight
independently compiled modules: two compools, one driver program, and
five procedures performing various functions within the package.
Reference has already been made in Sections 2.1 and 2.2 to the
historical development of each of these modules. They are individually
described below, and their logical interrelationship appears in Table 2.
Table 3 lists the size of each module, and summarizes its function.

3.1  STCOMP 

STCOMP is the main compool of the statistics collector. It contains
declarations of simple items, tables, arrays, files, and procedures used
during execution, as well as a few declarations of statistical items
calculated by the scanner GETTOK. It is referenced by all executable
modules.

3.2  COMPST 

COMPST is the compool containing declarations of most statistical
quantities. It is only referenced by STATCO, where these values are
calculated, and PUTOUT, where they are output.

3.3  STATCO 

STATCO is the driver program of the package. It calls PASS1 to perform
the first pass, then does the statistical counting and calculation of
the second pass itself. After this is completed, it calls output
routines PUTOUT and TABOUT. It contains closes to collect statistics
about arithmetic expressions, named tables, unnamed tables, and arrays,
and subroutines to determine type of assignment and to collect
statistics about item description types.

3.4 PASS1

PASS1 conducts the first pass of statistics collection, which consists
primarily of symbol table creation. It calls GETTOK to obtain the next
input token when needed, employing token lookahead in some instances. It
also calls TOKHSH to hash code identifiers. PASS1 contains a close for
assigning symbol table name, scope, and class to identifiers, and
subroutines to discern between unsigned and signed integer and fixed
items and to create symbol table entries for like tables.

3.5  GETTOK 

GETTOK is the scanner. It performs essentially as a finite state
machine, scanning the stream of input characters until a token of the
language is detected. In the case of a define name reference, the scan
can be diverted to the appropriate define string. GETTOK contains
subroutines for assigning a numerical value to each character and for
detecting and handling the end of an input line or define string.

A list of the tokens of the JOVIAL J3 language, all of which are now
detected by GETTOK, appears in Table 4.
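
The finite state machine approach can be sketched as follows. This is a
simplified illustration in Python, not the GETTOK source. It assumes
only three states (START, NAME, NUMBER), treats names as letters,
digits, and primes, and recognizes just four two-character tokens from
Table 4; define string diversion and comments are omitted. The
function name scan is hypothetical.

```python
def scan(text):
    """Yield tokens from `text` using a small finite state machine."""
    state, start, i = "START", 0, 0
    while i <= len(text):
        # A sentinel blank past the end of the input flushes the last token.
        ch = text[i] if i < len(text) else " "
        if state == "START":
            if ch.isalpha():
                state, start = "NAME", i
            elif ch.isdigit():
                state, start = "NUMBER", i
            elif not ch.isspace():
                two = text[i:i + 2]
                if two in ("==", "**", "($", "$)"):
                    yield two          # two-character token
                    i += 1
                else:
                    yield ch           # single-character token
            i += 1
        elif state == "NAME":
            if ch.isalnum() or ch == "'":
                i += 1
            else:
                yield text[start:i]
                state = "START"        # re-examine ch in the START state
        else:  # NUMBER
            if ch.isdigit():
                i += 1
            else:
                yield text[start:i]
                state = "START"


if __name__ == "__main__":
    print(list(scan("XX($I$) = YY**2 $")))
```

Each call to next() on the generator returns exactly one token, which
mirrors the one-token-per-call behavior described for GETTOK.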

3.6 TOKHSH

TOKHSH, the hash coding routine, uses a quadratic hash coding algorithm
which places each distinct compool and source program identifier at a
unique position in the hash table, and sets up a symbol table reference
for the identifier. It contains subroutines to determine whether
identifiers of identical spelling are in fact the same or different with
regard to scope and context, and to determine how an identifier is
declared.
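
The quadratic hash coding scheme described above and in the footnote to
Section 2.1 can be sketched as follows. This is an illustration in
Python under stated assumptions, not the TOKHSH source: the integer
function of the name is taken to be the sum of its character codes
(the report mentions the sum of machine word representations), the
quadratic function is assumed to be k*k for k = 0, 1, 2, ..., and the
function names are hypothetical.

```python
def hash_location(name, n):
    """Initial table location: an integer function of the name, modulo n."""
    return sum(ord(c) for c in name) % n


def probe_sequence(name, n):
    """Yield candidate locations: the original location plus k*k, modulo n."""
    home = hash_location(name, n)
    for k in range(n):
        yield (home + k * k) % n


def insert(table, name):
    """Store `name` at its first vacant (or already matching) location; return the index."""
    for loc in probe_sequence(name, len(table)):
        if table[loc] is None or table[loc] == name:
            table[loc] = name
            return loc
    raise RuntimeError("no vacant position found")


def lookup(table, name):
    """Repeat the probe sequence to find where `name` was stored, or return None."""
    for loc in probe_sequence(name, len(table)):
        if table[loc] is None:
            return None
        if table[loc] == name:
            return loc
    return None


if __name__ == "__main__":
    table = [None] * 11
    print(insert(table, "AB"), insert(table, "BA"), lookup(table, "BA"))
```

In this sketch "AB" and "BA" have the same character sum and so collide
at the same initial location; the second name is moved by the first
quadratic step, and a later lookup repeats the same steps to find it
without searching the rest of the table.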

3.7 PUTOUT

PUTOUT is called by STATCO, and prints a summary of the statistical
information collected and evaluated by the package.

3.8  TABOUT 

TABOUT, also called by STATCO, prints the hash coding and symbol tables
for the input program, including any associated compools.


4.0  DOCUMENTATION 

Written information about the JOVIAL J3 Statistics Collector and its
construction and implementation has been maintained in various forms for
a number of aspects of the effort. These are listed below, and their
structure is portrayed in Table 5. The number of pages in each document
appears in Table 6.

4.1  Descriptions  and  Reports 

A considerable amount of general information about the statistics
collector has been provided at various times during the course of the
project. A brief summary of such documents is presented here.

Early in the effort, a brief review of the nature and purpose of the
statistics collector was set forth in "JOVIAL J3 Statistics Collector,"
dated 10 September 1974. This was accompanied by "Statistics Collector
Components," an outline of statistics expected to be gathered by the
package, bearing the same date. This report and outline both reside on
the HIS 600/6000 Multics computer system at RADC.

An early, and now somewhat out of date, design plan for the statistics
collector appears in "Overall Design of the JOVIAL J3 Statistics
Collector." The date of this paper is uncertain, but is probably mid
October 1974. A handwritten copy of this report exists in the statistics
collector records, but it has never been put onto a computer system.

Two interim reports of progress in the development of the statistics
collector have been produced. The first, "Status of the JOVIAL J3
Statistics Collector," dated 13 January 1975, resides on the GCOS
computer system at RADC, and has been distributed to some extent. This
report summarizes early work on the package, discusses its structure in
some detail, and provides an analysis of errors made up to that point.
This material was updated in "Second Interim Report on the JOVIAL J3
Statistics Collector," dated 26 March 1976, which resides on the Multics
computer system at RADC, but has never been formally published.

An RADC Status Report, Form 77b, dated 1 October 1976, describes the
state of the statistics collector effort as of that date in handwritten
tabular form, and is in the statistics collector records.

It is expected that a brief report will be produced at the end of the
entire statistics collector project, updating this RADC Technical Report
and summarizing the final work on the effort.

4.2  List  of  Highlights 

This is a chronological listing of the major milestones and extensions
of capability for the statistics collector from its inception to the
present.

4.3  Daily  Diary 

This is a day-by-day account of work performed on the statistics
collector, and other related events, from 23 September 1974 to the
present. Monthly summaries describe work accomplished prior to the
inception of daily accounting.

4.4  Time  and  Cost  Information 

Hours spent working on the statistics collector, number of terminal
sessions, total terminal connect time, terminal cost, number of batch
runs, total processor time for batch runs, batch cost, and total
computer cost are recorded for each date on which work was performed on
the statistics collector from 15 October 1974 to the present. Table 7
contains a summary of this information on a monthly basis.

4.5  Error  Data 

This is a record of all programmer errors made on the package since 1
October 1974, and all compiler and system errors occurring since its
inception. Prior to 30 September 1975, accounts of errors were written
up in paragraph form; since 1 October 1975, they have been recorded in
tabular form. Programmer errors have been subdivided into six
categories: forgetfulness, logic, data management, subroutine linkage,
input/output, and resource allocation. Aspects of errors considered
include criticality, relationship to attempted correction of previous
errors, and number of runs required to fix. A summary of the error
information obtained so far in the project appears in Table 8.

Reclassification of these errors based on the categories established by
TRW Systems Group in a recent study of software reliability (1) has been
accomplished. The results of this reclassification appear in Table 9.
Further discussion of error analysis appears in Section 6.

4.6  Computer  Output 

Listings of terminal sessions and batch runs have been saved since quite
early in the effort. These listings are not complete, but cover most of
the significant work done on the statistics collector.


5.0 RESULTS ACHIEVED TO DATE

As is indicated in Table 10, the statistics collector now possesses an
extensive capability for describing usage of JOVIAL J3 constructs and
features by programmers. Counts and percentages of both declaration and
executable statement types are provided. A further breakdown of tokens
in arithmetic expressions is derived. Information such as number of
lines in the program, average line length, average comment length, etc.,
is also calculated.

No real analysis of the statistical data obtained by running various
programs through the statistics collector has yet been conducted. In
fact, some of the programs used as input are not truly representative of
JOVIAL J3 software, so the applicability of such an analysis might be
limited. However, it has been noticed that the use of structured
programming technology has been somewhat lacking (data not explicitly
declared, IF statements used in preference to IFEITH, many GOTO
statements, etc.). Another observation is that certain programmers tend
to completely avoid using particular features of the language, such as
arrays or exchange statements.

An example of an actual output listing from execution of the statistics
collector, including hash coding and symbol tables, appears in Table 11.
The first pass and symbol table routine PASS1 of the statistics
collector itself, with reference to main statistics collector compool
STCOMP, was used as source program input in this case.


6.0  ANALYSIS  OF  ERRORS 

As mentioned in Section 4.5, the source and compiler errors made in the
process of coding the statistics collector have been categorized both in
terms of the implementor's own classification scheme, and that devised
by TRW Systems Group (2). These errors were analyzed at such length
because this effort provided a ready-made opportunity to observe the
commission, discovery, and correction of errors during the actual
software development process. It is hoped that investigation of this
data and its comparison with that from other software projects will
provide meaningful insight into the problem of software errors.

As indicated previously, Tables 8 and 9 summarize this data. A further
analysis of each of the two error classifications follows in Sections
6.1 and 6.2.

Another  important  aspect  of  source  errors  is  the  relationship  between 
the  total  number  of  errors  and  the  lines  of  code  produced.  Graph  2 
expresses  this  relationship. 

6.1  Implementor's  Classification 

As stated in Section 4.5, this classification consists of six categories
of user errors, plus a separate category for compiler or system errors.
The most prevalent categories of user errors are seen to be logic and
data management. This reflects the extensive logical branching required
to identify language constructs from the source code and tokens, and the
extensive manipulation required to operate upon data entities,
particularly in relation to the symbol table. Subroutine linkage errors
are fewer in number, but are more often critical (causing all or most of
the statistics collector to fail to operate properly), as would be
expected. Resource allocation errors are almost always critical, for the
package cannot really run without the provision of required space or
time. Nearly 20% of all user errors are related to attempts to correct
previous errors; this is considered a high figure.
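
The percentages quoted above can be rechecked from the counts in
Table 8. The following Python sketch is only an arithmetic check added
for illustration; the counts are copied from Table 8, and the names
ERRORS, pct, and summarize are hypothetical.

```python
# (total, critical, secondary) error counts per category, copied from Table 8.
ERRORS = {
    "Forgetfulness": (49, 28, 6),
    "Logic": (119, 22, 22),
    "Data Management": (80, 35, 16),
    "Subroutine Linkage": (19, 13, 7),
    "Input/Output": (10, 1, 1),
    "Resource Allocation": (8, 8, 0),
}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimal places."""
    return round(100.0 * part / whole, 2)


def summarize(errors):
    """Return the total error count and (critical %, secondary %, share %) per category."""
    total = sum(t for t, _, _ in errors.values())
    rows = {}
    for name, (t, crit, sec) in errors.items():
        rows[name] = (pct(crit, t), pct(sec, t), pct(t, total))
    all_crit = sum(c for _, c, _ in errors.values())
    all_sec = sum(s for _, _, s in errors.values())
    rows["Totals"] = (pct(all_crit, total), pct(all_sec, total), 100.0)
    return total, rows


if __name__ == "__main__":
    total, rows = summarize(ERRORS)
    print(total)
    for name, values in rows.items():
        print(name, values)
```

With these counts, 52 of the 285 user errors are secondary, which is the
"nearly 20%" figure cited in the text, and all 8 resource allocation
errors are critical.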

6.2  TRW  Classification 

As is seen in Table 9, the errors are divided among 20 categories (3), 4
of which are considered non-applicable to this effort for the reasons
stated. As with the previous classification, logic and data handling are
the most prevalent error categories, and errors related to the global
operation of the software tend to be more critical. The "Recurrent"
category is a catch-all for all errors resulting from another attempted
error correction, and these errors are not further classified, except
with regard to criticality. Compiler or system errors are generally
included in the "Operating System/System Support Software" category.

The total number of errors included in this second classification is
slightly greater than that in the first because some errors were made
between the times of the two classifications, and also because some
single errors in the first classification were considered multiple
errors for purposes of the second.


7.0 FUTURE PLANS

The statistics collector is now largely complete, with most of the
desired capability having been incorporated into the package. There are
a few features yet to be added. These include collection of further
information about IF, IFEITH, and FOR statements, and the levels of
nesting of such statements; increased analysis of the right side of
assignment statements; treatment of arithmetic expressions in contexts
besides assignment; and recording of frequency of usage of
implementation procedures and functions.

It is also planned to modify the statistics collector to record
information about a large number of programs in a data base maintained
by the host system. Currently, information can only be obtained about
one program at a time, and is lost after output. By replacing the
process of outputting information about a single program with that of
using this information to update a continuously existing data base, a
statistical summary of as many programs as desired can be maintained.
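
One way such a continuously existing data base could be updated is
sketched below. This is a hypothetical illustration in Python, not a
design from this report: the JSON file format, the function name
update_database, and the record layout are all assumptions.

```python
import json
from collections import Counter
from pathlib import Path


def update_database(path, program_name, counts):
    """Merge one program's statistic counts into a JSON file that persists between runs."""
    path = Path(path)
    if path.exists():
        data = json.loads(path.read_text())
    else:
        data = {"programs": [], "totals": {}}
    totals = Counter(data["totals"])
    totals.update(counts)  # add this program's counts to the running totals
    data["totals"] = dict(totals)
    data["programs"].append(program_name)
    path.write_text(json.dumps(data, indent=2))
    return data


if __name__ == "__main__":
    import os
    import tempfile

    db = os.path.join(tempfile.mkdtemp(), "stats.json")
    update_database(db, "PASS1", {"IF": 3, "GOTO": 2})
    print(update_database(db, "GETTOK", {"IF": 1}))
```

Each run adds the counts for one program to the stored totals instead of
discarding them after output, so the summary grows as more programs are
processed.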


8.0  CONCLUSION 

As has already been indicated, the JOVIAL J3 Statistics Collector has
been in the process of development for some time, and is now approaching
completion. The value of this tool to the Air Force and others cannot be
fully appraised until it is run against a significant number of
representative yet diverse JOVIAL J3 programs. However, the specific
results derived from the limited inputs applied to the package to date
give promise of its increased utility in the future. The task of its
construction has been a valuable learning experience for the
implementor, both in regard to the JOVIAL J3 language and to software
design and development procedures. As pointed out in Sections 4.5 and
6.0, the software error information collected throughout most of this
effort is expected to significantly increase understanding in this area.

In  short,  the  JOVIAL  J3  Statistics  Collector  has  already  enhanced  a 
number of aspects of software understanding within the Air Force, and
will  do  so  in  even  greater  measure  in  the  future.  It  will  prove  to  be  a 
highly  useful  tool  in  assisting  the  Air  Force  in  various  phases  of 
JOVIAL  J3  and  software  development. 


Start date: July 1974

Periods during which work was conducted:

    July 1974 through January 1975
    July 1975 through July 1976
    December 1976 through January 1977

Total time expenditure: 7 man-months

Total cost of computer (HIS 600/6000 GCOS) usage since 15 Oct 74:
    $7366.82

Number of independently compiled modules: 8

Lines of source code as of 21 Dec 76: 4023

User coding and implementation errors, 1 Oct 74 - 21 Dec 76: 285
Pages of written documentation as of 21 Dec 76: 307
Approximate percentage of code employing top-down design: 90%
Approximate percentage of code employing structured programming: 95%
Percentage of code written in JOVIAL J3: 100%

Table 1: Facts about JOVIAL J3 Statistics Collector


[Diagram not reproduced: Subroutine Calling Sequence]

Table 2: Interconnection of Statistics Collector Modules


Module    Function                               Size in Lines

STCOMP    Main compool                                     226
COMPST    Compool containing statistical data              213
STATCO    Driver program                                   810
PASS1     First pass, symbol table creation                931
GETTOK    Scanner, token detector                          754
TOKHSH    Hash coding                                      287
PUTOUT    Statistical output                               577
TABOUT    Hash coding and symbol table output              225

Total number of lines                                     4023
Average number of lines per module                         503

Table 3: Size and Function of Statistics Collector Modules


I. Tokens identifying type of statement or clause

    IF          FOR         RETURN
    IFEITH      GOTO        TEST
    ORIF        STOP        ASSIGN


II. Tokens identifying type of declaration

    ITEM        CLOSE
    STRING      PROGRAM or 'PROGRAM (independently compiled close)
    TABLE       ARRAY
    FILE        COMMON
    PROC        MODE
    OVERLAY     SWITCH
    DEFINE      MONITOR

III. Single letter tokens indicating declaration attributes

    A (fixed item, or integer if precision not specified)
    B (boolean item or binary file)
    C (ASCII item)
    D (dense table packing)
    F (floating item)
    H (Hollerith item or file)
    I (integer item)
    L (like table)
    M (medium table packing)
    N (no table packing)
    P (item preset or parallel table structure)
    R (rounding, rigid table size, or fixed length file)
    S (signed or status item, or serial table structure)
    T (transmission code item)
    U (unsigned item)
    V (variable table size, or variable length file)

IV. Statement or character string delimiters

    $ (statement or declaration terminator)
    BEGIN, END (compound statement, grouped declaration, or constant
        list delimiters)
    START, TERM (independent compilation delimiters)
    DIRECT, JOVIAL (direct code delimiters)
    ( ) (parentheses)
    ($ $) (subscript or BIT or BYTE parameter delimiters)
    , (comma)
    ... (range limit separator in item descriptions)
    . (period, statement name terminator)


V. Arithmetic, relational, or logical operators

    A. Arithmetic and assignment operators

        == (exchange)
        *                   /
        +                   ** (exponentiation)
        -                   (* *) (exponentiation brackets)

    B. Relational operators

        EQ (equal)          LQ (less than or equal)
        NQ (not equal)      GR (greater than)
        LS (less than)      GQ (greater than or equal)

    C. Logical operators

        AND
        OR
        NOT

VI. Variables and constants

    Identifier
    FOR parameter
    Integer constant
    Hollerith constant
    Transmission code constant
    ASCII constant


VII. Built-in functions

    BIT                 ENT or ENTRY
    BYTE                LOC or 'LOC
    CHAR                NENT
    MANT                NWDSEN
    ABS                 ODD
    (/ /) (absolute value brackets)
    POS                 ALL

VIII. Input/output primitives

    OPEN                INPUT
    OUTPUT

Table 4: Tokens of the JOVIAL J3 Language


[Diagram not reproduced; surviving label: Information]

Table 5: Structure of Documentation about Statistics Collector


Item                                                      Pages

Listings of Source Code                                   80(1)
"JOVIAL J3 Statistics Collector"                              1
"Statistics Collector Components"                             2
"Overall Design of the JOVIAL J3 Statistics Collector"        2
January 1975 Interim Report                                  13
March 1976 Interim Report                                     6
Early Sample Output Listing                                   1
Standard RADC Status Report                                   2
List of Highlights                                            4
Diary                                                       108
Time and Cost Accounting Information                         12
List of Errors                                               80

Total Number of Pages of Documentation                      311

(1) Based on 50 lines of source code per page (code totals 4023 lines).

Table 6: Available Written Documentation Regarding the JOVIAL J3
Statistics Collector as of 21 December 1976
                Hours        Terminal Usage               Batch Runs            Total
Month           Spent  Sessions  Time(1)      Cost     #  Time(2)      Cost      Cost

October(3)         46         8     .812     $6.52     8    .1113    $60.68    $67.20
November           86        30    9.464     63.31    26    .1585    167.70    231.01
December           44        21    6.465     45.27    17    .2792    212.27    257.54
1974 Totals(4)    176        59   16.741   $115.10    51    .5490   $440.65   $555.75

January            51        13    4.633    $50.21     8    .3740   $293.49   $343.70
February            0         0     .000       .00     0    .0000       .00       .00
March               0         0     .000       .00     0    .0000       .00       .00
April               0         0     .000       .00     0    .0000       .00       .00
May                 0         0     .000       .00     0    .0000       .00       .00
June                0         0     .000       .00     0    .0000       .00       .00
July               28         9    3.689     26.45    14    .1554     96.94    123.39
August             63        37    9.172     69.70    32    .5506    370.75    440.45
September          81        63   12.880     99.88    49    .6800    445.21    545.09
October            76        22    8.728     65.38    19    .5982    224.05    289.43
November           48        15    3.579     30.45    11    .4132    144.06    174.51
December          102        46   17.298    136.91    41   3.2360    933.64   1070.55
1975 Totals       449       205   59.979   $478.98   174   6.0074  $2508.14  $2987.12

January            80        23   11.400    $85.57    20    .9993   $290.54   $376.11
February           63        30   10.497     82.28    29    .4693    556.65    638.93
March              73        22    5.876     49.05    21    .3107    432.19    481.24
April              44        25    9.657     73.92    21    .3221    280.10    354.02
May                61        20    9.402     68.34    17    .2279    232.09    300.43
June              125       109   29.610    239.14    77   1.2700    940.82   1179.96
July                9         4    1.725     12.36     8    .0576     46.75     59.11
August              0         0     .000       .00     0    .0000       .00       .00
September           0         0     .000       .00     0    .0000       .00       .00
October             0         0     .000       .00     0    .0000       .00       .00
November            0         0     .000       .00     0    .0000       .00       .00
December           73        38   10.968    101.65    33    .4146    332.50    434.15
1976 Totals       528       271   89.135   $712.31   226   4.0715  $3111.64  $3823.95

Grand Totals     1153       535  165.855  $1306.39   451  10.6279  $6060.43  $7366.82

(1)  Clock  time. 

(2) Central Processing Unit (CPU) time.

(3) From 15 October 1974 through end of month.

(4) From 15 October 1974 through end of year.


Table 7: Statistics Collector Time and Cost Expenditures,
15 October 1974 - 31 December 1976
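
The yearly totals in Table 7 can be cross-checked against the Grand Totals row with a few lines of arithmetic. The following is a minimal sketch in Python; the dictionary layout and the function name `grand_totals` are illustrative assumptions and are not part of the statistics collector. Only the hours, session, run, and cost columns are transcribed.

```python
from decimal import Decimal as D

# Yearly totals transcribed from Table 7, in the order:
# (hours spent, terminal sessions, terminal cost, batch runs, batch cost).
# Costs are kept as Decimal so that dollar amounts add exactly.
YEARLY = {
    1974: (176, 59, D("115.10"), 51, D("440.65")),
    1975: (449, 205, D("478.98"), 174, D("2508.14")),
    1976: (528, 271, D("712.31"), 226, D("3111.64")),
}

def grand_totals(yearly):
    """Sum each column over all years (hypothetical helper)."""
    return tuple(sum(column) for column in zip(*yearly.values()))

hours, sessions, terminal_cost, batch_runs, batch_cost = grand_totals(YEARLY)
total_cost = terminal_cost + batch_cost

print(hours, sessions, terminal_cost, batch_runs, batch_cost, total_cost)
```

Under these assumptions the sums agree with the Grand Totals row: 1153 hours, 535 terminal sessions, 451 batch runs, and a total cost of $7366.82.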
User Errors

                        Total #  Critical Errors(1)  Secondary Errors(2)      % of all
Category              of Errors         #        %          #        %   User Errors

Forgetfulness                49        28    57.14          6    12.24         17.19
Logic                       119        22    18.49         22    18.49         41.75
Data Management              80        35    43.75         16    20.00         28.07
Subroutine Linkage           19        13    68.42          7    36.84          6.67
Input/Output                 10         1    10.00          1    10.00          3.51
Resource Allocation           8         8   100.00          0     0.00          2.81

Totals                      285       107    37.54         52    18.25        100.00

Compiler or System Errors(3)

Critical                     13
Non-critical                  4
Total                        17


(1) Errors  causing  most  or  all  of  the  statistics  collector  to  fail  to 
operate  properly. 

(2) Errors caused by attempted correction of another error.

(3) Collection  of  data  begun  at  inception  of  project  in  July,  1974. 


Table 8: Statistics Collector Errors from 1 October 1974 through 21
December 1976, Using Implementor's Own Classification
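
The percentage columns of Table 8 follow directly from the error counts: the critical and secondary percentages are taken relative to each category's own total, and the last column is each category's share of all user errors. The sketch below, in Python, recomputes them from the counts; the names `USER_ERRORS`, `pct`, and `percentages` are illustrative assumptions only.

```python
# Counts transcribed from Table 8:
# category -> (total errors, critical errors, secondary errors)
USER_ERRORS = {
    "Forgetfulness": (49, 28, 6),
    "Logic": (119, 22, 22),
    "Data Management": (80, 35, 16),
    "Subroutine Linkage": (19, 13, 7),
    "Input/Output": (10, 1, 1),
    "Resource Allocation": (8, 8, 0),
}

def pct(part, whole):
    """Percentage of `whole` represented by `part`, to two decimals."""
    return round(100.0 * part / whole, 2)

def percentages(errors):
    """Return category -> (% critical, % secondary, % of all user errors)."""
    all_errors = sum(total for total, _, _ in errors.values())
    return {
        name: (pct(crit, total), pct(sec, total), pct(total, all_errors))
        for name, (total, crit, sec) in errors.items()
    }

rows = percentages(USER_ERRORS)
for name, (crit_pct, sec_pct, share) in rows.items():
    print(f"{name:<20}{crit_pct:>8.2f}{sec_pct:>8.2f}{share:>8.2f}")
```

The same calculation applied to the column sums (107 critical and 52 secondary errors out of 285) gives the 37.54 and 18.25 percent figures of the Totals row.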
Error                                               Number of Errors            % of       % of
Cate-                                                       Non-            Critical    Total #
gory  Description of Category                   Critical  Critical  Total     Errors  of Errors

   1  Computational                                    5        22     27      18.52       7.92
   2  Logic                                           29        78    107      27.10      31.38
   3  Input/Output                                     5        10     15      33.33       4.40
   4  Data Handling                                   26        34     60      43.33      17.60
   5  Operating System/System Support Software        13         4     17      76.47       4.99
   6  Configuration                                    0         0      0       0.00       0.00
   7  Routine/Routine Interface                        8         7     15      53.33       4.40
   8  Routine/System Software Interface                2         3      5      40.00       1.47
   9  Tape Processing Interface                 Not applicable because magnetic tapes not
                                                directly used with software
  10  User Interface                                   2         0      2     100.00       0.59
  11  Data Base Interface                              0         0      0       0.00       0.00
  12  User Requested Changes                    Not applicable because not yet any other users
  13  Preset Data Base                                10         2     12      83.33       3.52
  14  Global Variable/Compool Definition               7         2      9      77.78       2.64
  15  Recurrent                                       26        38     64      40.62      18.77
  16  Documentation                             Not applicable because such errors not recorded
  17  Requirements Compliance                          8         0      8     100.00       2.35
  18  Operator                                         0         0      0       0.00       0.00
  19  Questions                                 Not applicable because not yet any other users
  20  Unidentified                                     0         0      0       0.00       0.00

      Totals                                         141       200    341      41.35     100.00

Table 9: Statistics Collector Errors from 1 October 1974 through 31
December 1976, Using TRW Systems Group Classification
I. General information about program

A.  Number  of  lines 

B. Number of tokens with and without define expansion

C. Number of characters, and average per line

D. Number of comments, and average length

E.  Number  of  define  directives,  and  average  length 

F.  Number  of  define  calls,  and  average  per  define  directive 


II.  Number  and  percentage  of  various  types  of  declarations 

A.  Single  item 

B.  Array,  including  breakdown  by  number  and  size  of  dimensions 

C.  Table,  including  breakdown  by  table  attributes 

D.  Overlay 

E.  File 

F.  Switch 

G.  Close 

H.  Program 

I.  Common 

J.  Monitor 

K.  Procedure  and  function 

L.  Breakdown  of  all  item  descriptions  by  type 

M.  Number  of  mode  directives 


III. Summary of external references

A. Breakdown of compool resolved names

B. External closes referenced

C. System defined procedures and functions

D. Mode defined simple items

E.  Simple  items  resolved  to  default  attributes 

F.  Percentage  of  identifiers  declared  implicitly 


IV.  Number  and  percentage  of  various  types  of  statements 

A.  Assignment,  including  breakdown  by  type  of  assignment 

B.  Exchange 

C.  GOTO 

D.  Return 

E.  Stop 

F.  Test 

G. Procedure call

H. Input/output, including breakdown by exact nature

I. IF

J. IFEITH

K.  FOR 

L.  Direct 


V.  Summary  information  about  statements 

A. Total number of simple and complex executable statements

B. Total number of executable and non-executable statements

C. Number of compound statements

D.  Number  of  statement  labels 


VI. Information about arithmetic expressions

A. Total number

B. Total number of tokens

C. Number of constants

D.  Number  of  variables 

E.  Number  of  subscripted  variables 

F. Counts of occurrences of various operators

G. Counts of references to built-in and other functions


Table 10: Statistical Information about Source Program Currently
Available
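
To make the quantities in Section I of Table 10 concrete, the sketch below computes a few of them (items A, C, and D) for a small source fragment. It is written in Python purely for illustration and is not the statistics collector's implementation. It assumes that comments are delimited by pairs of double primes ('' ... ''); the function name `general_statistics` and the sample fragment are hypothetical, and define expansion and token counting are not attempted.

```python
import re

# Assumption: a comment is any text enclosed between two double primes.
COMMENT = re.compile(r"''(.*?)''", re.DOTALL)

def general_statistics(source: str) -> dict:
    """Line, character, and comment statistics for a source text."""
    lines = source.splitlines()
    n_chars = sum(len(line) for line in lines)  # line terminators excluded
    comments = COMMENT.findall(source)          # comment bodies only
    n_comment_chars = sum(len(body) for body in comments)
    return {
        "lines": len(lines),
        "characters": n_chars,
        "avg_chars_per_line": n_chars / len(lines) if lines else 0.0,
        "comments": len(comments),
        "avg_comment_length": n_comment_chars / len(comments) if comments else 0.0,
    }

# Hypothetical three-line fragment used only to exercise the function.
sample = "''SET COUNTER''\nAA = 1$\nBB = AA + 2$ ''SUM''\n"
stats = general_statistics(sample)
print(stats)
```

For the sample fragment this reports 3 lines, 42 characters (14.0 per line), and 2 comments with an average length of 7.0 characters, where comment length is measured without the delimiters.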